Vision Based Control for Space Applications
Abstract
This paper presents the work performed in the context of the VIMANCO ESA project, whose objective is to improve the autonomy, safety and robustness of robotic systems using vision. The approach we propose is based on an up-to-date recognition and 3D tracking method that makes it possible to determine whether a known object is visible in a single image, to compute its pose, and to track it in real time along the image sequence acquired by the camera, even in the presence of varying lighting conditions, partial occlusions and aspect changes. The robustness of the proposed method is achieved by combining an efficient low-level image processing step, statistical techniques that account for potential outliers, and a formulation of the registration step as a closed-loop minimization scheme. This approach is valid when a single camera observes the object, but it can also be applied to a multi-camera system. Finally, this approach provides all the data necessary for the manipulation of non-cooperative objects using the general formalism of visual servoing, a closed-loop control scheme on visual data expressed either in the image, in 3D, or in both spaces simultaneously. This formalism can be applied whatever the configuration of the vision sensors (one or several cameras) with respect to the robot arms (eye-in-hand or eye-to-hand systems). The global approach has been integrated and validated on the Eurobot testbed located at ESTEC.

2. Background, Objectives and Overall Approach

Future space automation and robotics applications require the use of vision to perform their calibration and the precise interactions with the environment that their tasks demand. Vision is therefore essential to increasing the autonomy of space robotic agents. The VIMANCO activity is mainly targeted at EUROBOT, whose purpose is to prepare and assist EVAs on the International Space Station. A typical scenario for EUROBOT is to place an APFR (Adjustable Portable Foot Restraint) at given locations on the ISS. This involves walking on the handrails and inserting the APFR into a specific fixture called a WIF. In this context, vision is an enabling technology for both the autonomy and the safety of EUROBOT.

First, although the positions of the handrails and fixtures are well known, there will be some inaccuracy in the placement of the robot, increasing with its movements. Vision processing of the acquired images allows EUROBOT to determine the precise positions of the objects to grasp and of the locations where they must be inserted or placed, which is a prerequisite to performing the grasping or insertion task itself. Second, object recognition provides EUROBOT with the ability to check the environment against its a priori knowledge and to detect discrepancies. Extending this concept, it would allow EUROBOT to “know” the position of astronauts with respect to itself, which is very valuable information for advanced safety functions.

The European Robotic Arm (ERA) already performs insertion tasks using vision; however, it requires a specific visual target to compute the position of the objects to grasp. In the case of EUROBOT, it is not possible to put a target on every single object. Vision therefore has to cope with non-cooperative objects, i.e. objects that are not equipped with optical markers. The use of vision in space also has to tackle several specific problems, in particular the extreme contrast in images: direct sunlight makes objects appear very bright, while shadows are totally dark.
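The abstract describes the registration step of the tracker as a closed-loop minimization made robust by statistical handling of outliers. A generic way of writing such a criterion, with notation introduced here only for illustration (the formula is not spelled out in the text above), is

\[
\hat{\mathbf{r}} \;=\; \arg\min_{\mathbf{r}} \sum_{i} \rho\!\left( d\!\left(p_i,\; \mathrm{proj}(\mathbf{r}, M_i)\right) \right),
\]

where \(\mathbf{r}\) is the camera pose, \(M_i\) a feature of the 3D object model, \(\mathrm{proj}(\mathbf{r}, M_i)\) its projection in the image, \(d(\cdot,\cdot)\) the distance to the corresponding low-level image measurement \(p_i\), and \(\rho\) a robust M-estimator such as Tukey's function. Solving this minimization iteratively, typically by iteratively re-weighted least squares, down-weights outlying measurements; this is what allows the pose estimate to survive partial occlusions and the harsh illumination effects discussed in this section.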
Vision algorithms must be very robust in coping with the effect of shadows moving in the imaged scene, so as to allow safe and stable manipulation at any time. Another major problem in space is the lack of computing power for image processing: resource constraints (energy, volume, mass) and environmental constraints (thermal dissipation, radiation compatibility) limit the performance of the computers that may be used in space.

In this framework, the main objectives of the VIMANCO activity are, first, to define a Vision System Architecture applicable to EUROBOT, taking into account the characteristics of the EUROBOT environment and the applicability of vision techniques to EUROBOT operations; second, to implement a Vision Software Library allowing vision-based control for space robots; and finally, to breadboard the specific hardware and software and to demonstrate it on the ESTEC EUROBOT testbed.

Figure 1: For vision-based manipulation of a non-cooperative object, three steps are required: Object Recognition, Object Tracking and Visual Servoing.

Figure 2: VIMANCO system architecture.

To meet the vision control objectives, several steps are required. First, the object of interest has to be detected and recognized in the image acquired by the camera. This recognition step must also provide a coarse localization of the object, in both 2D and 3D, with respect to the camera. The recognition step is usually time consuming, and the resulting localization is not precise enough to be used for controlling the robot. We therefore propose to rely on a tracking process: once the object is known, it can be tracked over frames, at video rate, using 3D model-based tracking algorithms. These algorithms can use a single camera but can also exploit stereo with a small or wide baseline. Finally, the output of this algorithm (a precise 2D and 3D localization) can be used to control the motion of the robot according to a predefined task. We describe these three steps below.

Figure 2 illustrates the global VIMANCO system design. It is composed of:

- The Vision System. It consists of a stereo pair of cameras attached to a mechanical support and two independent cameras. The stereo camera and each of the two independent cameras are equipped with an illumination device.
- The Vision System Simulator. It is a 3D graphic tool used to reconstruct the robotic system and its environment and to produce virtual images. It simulates, as faithfully as possible, the Vision System mounted on the targeted robotic system.
- The Vision Processing and Object Recognition Library. It implements all the functionality needed for Object Recognition, Object Tracking and Visual Servoing. It also provides the means to control their execution and to communicate with the other systems.
- The A&R Simulator. It replaces the robot controller functionality needed to validate and demonstrate the whole approach. In real operations, the A&R Simulator is replaced by the corresponding A&R Controller.
- The Control Station. It provides the HMIs that allow the operator to run the VIMANCO simulations: to activate Actions/Tasks on the A&R Simulator, to configure, monitor and control the Vision System Simulator, and to visualize the acquired images.
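Before detailing the first of these steps, it is worth recalling the classical visual servoing control law on which the third step relies. This is the standard formulation from the visual servoing literature, recalled here only as background; the notation is not taken from the VIMANCO library.

\[
\mathbf{e} = \mathbf{s} - \mathbf{s}^{*}, \qquad
\mathbf{v}_c = -\lambda\, \widehat{\mathbf{L}}_{\mathbf{e}}^{+}\, \mathbf{e},
\]

where \(\mathbf{s}\) gathers the current values of the selected visual features (image measurements for image-based servoing, a pose for position-based servoing, or a combination of both), \(\mathbf{s}^{*}\) their desired values, \(\widehat{\mathbf{L}}_{\mathbf{e}}^{+}\) the pseudo-inverse of an approximation of the interaction matrix relating the feature motion to the camera velocity, \(\lambda\) a positive gain, and \(\mathbf{v}_c\) the camera velocity sent to the robot controller. The recognition and tracking steps described next provide the measurements from which \(\mathbf{s}\) is built.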
3. Object Recognition

The goal of object recognition is, as the name states, to find back specific objects when they are seen in new situations or in different images. It is a subject that has been studied in computer vision since the early days of the field, about fifty years ago. Most systems in those days worked in a rather ad-hoc manner on simplified objects such as polygons and polyhedra. The research community has come a long way since then. In general, the task of recognizing an object is made difficult by the possible variability of the camera's internal parameters, its position and orientation, the illumination conditions and even the constellation of the visible objects.

In the VIMANCO activity, object recognition has been designed and implemented as follows. During the mission preparation phase, an off-line training is performed in which the objects to be recognized are modelled using features, together with their corresponding feature descriptors and 3D coordinates. During operations, the on-line object recognition is based on the same feature extraction and description procedure, followed by an additional feature matching and verification step.

During the off-line training, the Object Recognition component describes the objects to be recognized using local invariant features. A specific application with an HMI front-end is employed for this. The algorithm and corresponding data flow of this application are shown in Figure 3.

Figure 3: Object Recognition: off-line training.

Figure 4: On-line Object Recognition.

The input to the system consists of images of the object to be modelled. For every image, a set of features is first extracted using the Feature Extraction component. These will typically be affine-invariant or rotation- and scale-invariant features such as MSER, IBR, SIFT or SURF regions. Once these features are extracted, a feature descriptor can be computed for each of them; the Feature Description component is used to this end. The output consists of a descriptor for each feature, containing the description of that feature. In order to initialize the camera pose for Object Tracking, 3D-2D correspondences will be computed later, and these can be used to compute the camera pose w.r.t. the object. Since feature matching is performed in the images, we need to assign 3D coordinates to every feature. We do so using a dedicated graphical tool, Assign 3D Coordinates.

As depicted in Figure 3, the data flow between the three components of the off-line training step is straightforward. An image of the object is the only input to the system. The locations of the features are an extra input to the description phase. In order to compute the 3D coordinates of the features, a (simplified) 3D model of the object is needed as well.

During real operation, the system needs to identify objects in the image or to certify their presence. This is the goal of the object recognition phase, implemented in the Object Recognition activity, whose data flow diagram is shown in Figure 4. The first two components of this phase are the same Feature Extraction and Feature Description components as in the off-line training phase: the first step in recognizing an object in an image consists of locating features in this image and describing them with the same algorithms as before. The newly found feature descriptors can then be matched to the feature descriptors of the objects in the database; this is done in the Feature Matching component. The result of this component consists of matches between features, i.e. 2D-2D correspondences. These results can contain mismatches, while other (correct) matches might have been missed.
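As an illustration of this kind of invariant-feature extraction, description and matching, a minimal sketch using OpenCV's SIFT detector and a nearest-neighbour matcher with Lowe's ratio test is given below. It is not the VIMANCO library itself, and the image file names are placeholders.

import cv2

# Load one model view (from the off-line training set) and the current target image.
# The file names are placeholders used only for this illustration.
model_img = cv2.imread("model_view.png", cv2.IMREAD_GRAYSCALE)
target_img = cv2.imread("target_view.png", cv2.IMREAD_GRAYSCALE)

# Feature extraction and description (SIFT here; MSER, IBR or SURF regions play the same role).
sift = cv2.SIFT_create()
model_kp, model_desc = sift.detectAndCompute(model_img, None)
target_kp, target_desc = sift.detectAndCompute(target_img, None)

# Feature matching: for each target descriptor, find its two nearest model descriptors and
# keep the match only if the best one is clearly better than the second best (ratio test).
matcher = cv2.BFMatcher(cv2.NORM_L2)
knn_matches = matcher.knnMatch(target_desc, model_desc, k=2)
good_matches = [m for m, n in knn_matches if m.distance < 0.75 * n.distance]

# Each surviving match is a tentative 2D-2D correspondence between target and model features.
print(len(good_matches), "tentative 2D-2D correspondences")

Even after the ratio test, some of these tentative correspondences are wrong, which is exactly what the verification step described next has to deal with.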
Such mismatches can be ameliorated by the Verification component, which outputs its result in the form of matches between 3D coordinates (found by the Assign 3D Coordinates component in the preprocessing phase) and 2D coordinates (of the features extracted in the target image). These 3D-2D correspondences can be used to compute an initialization of the camera pose w.r.t. the object. The data flow is summarized in Figure 4: features and feature descriptors are extracted from the target image and matched to the model features, i.e. the features extracted from the model images during the off-line training phase, and the resulting 2D-2D correspondences are checked in the verification step to yield 3D-2D correspondences.
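To make this last step concrete, the pose initialization from verified 3D-2D correspondences can be sketched with a standard robust Perspective-n-Point solver. This is a generic illustration, assuming a known camera calibration and the variable names described in the comments; it is not the code of the VIMANCO library.

import numpy as np
import cv2

def initial_pose_from_correspondences(object_points, image_points, camera_matrix, dist_coeffs):
    # object_points: (N, 3) 3D model coordinates of the verified features (object frame).
    # image_points:  (N, 2) corresponding 2D feature locations in the target image (pixels).
    # camera_matrix, dist_coeffs: intrinsic calibration of the camera, assumed known.
    object_points = np.asarray(object_points, dtype=np.float64)
    image_points = np.asarray(image_points, dtype=np.float64)

    # Robust Perspective-n-Point: RANSAC rejects remaining outlier correspondences while
    # estimating the rotation (rvec, axis-angle) and translation (tvec) of the object frame
    # expressed in the camera frame.
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(
        object_points, image_points, camera_matrix, dist_coeffs,
        iterationsCount=200, reprojectionError=3.0)
    if not ok:
        return None

    # Convert the axis-angle rotation to a 3x3 matrix. This coarse pose is precise enough
    # to bootstrap the 3D model-based tracker, which then refines it at video rate.
    R, _ = cv2.Rodrigues(rvec)
    return R, tvec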